perm filename 0[00,BGB] blob sn#046203 filedate 1973-06-05 generic text, type C, neo UTF8
COMMENT āŠ—   VALID 00006 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	STANFORD ARTIFICIAL INTELLIGENCE LABORATORY	            MAY 1973.
C00005 00003	Page 1. Illustration: 16 frames of a doll baby on a turn table.
C00006 00004	~F8INTRODUCTION.
C00015 00005	PAGE 3. Illustration: 32 frames of blocks on a turn table.
C00016 00006	~F8NTRODUCTION.
C00020 ENDMK
CāŠ—;
STANFORD ARTIFICIAL INTELLIGENCE LABORATORY	            MAY 1973.
MEMO AIM-199

COMPUTER SCIENCE DEPARTMENT REPORT
NO. CS-362


                    Image Contouring and Comparing.


                          Bruce g. Baumgart


ABSTRACT:

	A contour  image representation  is stated  and an  algorithm
for  converting  a   set  of  digital  television  images  into  this
representation is explained.   The algorithm consists of five  steps:
digital  image  thresholding,    binary image  contouring,    polygon
nesting,     polygon   smoothing,     and  polygon  comparing.     An
implementation of  the algorithm  is the  main routine  of a  program
called CRE;  auxiliary routines provide cart and  turn table control,
TV camera  input,   image  display,    and xerox  printer  output.  A
serendip application  of CRE to  type font contruction  is explained.
Details  about the intended application  of CRE to  the perception of
physical objects will appear in sequels to this paper.



	CONTENTS:

		     Introduction.

		  I. The CRE data structure.

		 II. The CRE algorithm.

		III. Using CRE.

		 IV. Using TVFONT.

		     Postscripts.



This  research was supported by the Advanced Research Projects Agency
   of the Office of the Secretary of Defense under contract SD-183.
Page 1. Illustration: 16 frames of a doll baby on a turn table.
~F8INTRODUCTION.

	The acronym CRE stands both for "Contour,  Region,  Edge" and
for  "Cart's  Eye". CRE  is  a  solution to  the  problem  of finding
contour  edges in  a  set  of  television  pictures  and  of  linking
corresponding edges  from one picture  to the  next.  The  process is
automatic   and  is  intended  to  run  without  human  intervention.
Furthermore,   the process  is bottom  up; there  are no  significant
inputs other than the given  television images.  The output of CRE is
a 2D  contour map  data structure  which is  suitable input  to a  3D
geometric modeling program.

	The overall design  goal for CRE  was to build a  region edge
finding  program that could  be applied  to a sequence  of television
pictures and that  would output a sequence  of line drawings  without
having to know anything about  the content of the images. Furthermore
it was  desired that the line drawings be structured.  The six design
choices that determined the character of CRE are:

	1. Dumb vision rather than model driven vision.
	2. Multi image analysis rather than single image analysis.
	3. Total image structure imposed on edge finding; rather
	   than separate edge finder and image analyzer.
	4. Automatic rather than interactive.
	5. Fixed image window size rather than variable window size.
	6. Machine language rather than higher level language.

	The design  choices are  ordered from  the more strategic  to
the   more  tactical;   the  first   three  choices   being  research
strategies, the  latter  three  choices  being  programming  tactics.
Adopting these  design choices lead  to image contouring  and contour
map structures similar to that of Krakauer[3] and Zahn[4].

	The first design  choice does not  refer to the issue  of how
model dependent a  finished general vision system will be (it will be
quite model dependent),   but rather to the  issue of how one  should
begin  building such  a system.   I  believe that  the best  starting
points are  at the two apparent extremes of nearly total knowledge of
a particular  visual  world or  nearly total  ignorance.   The  first
extreme involves  synthesis (by computer graphics) of  a predicted 2D
image, followed by comparing the predicted and a perceived image  for
slight differences  which are  expected but  not yet  measured.   The
second  extreme involves  anaylsing perceived images  into structures
which can  be readily  compared for  near equality  and measured  for
slight differences;  followed by the  construction of a  3D geometric
model of  the perceived world. The point is that in both cases images
are compared,  and in both cases the 3D  model initially (or finally)
contains specific  numerical data on the geometry  and physics of the
particular world being looked at.
~I1973,835;~F8- 2 -~
PAGE 3. Illustration: 32 frames of blocks on a turn table.
~F8NTRODUCTION.

	The  second design  choice, of  multi  image anaylsis  rather
than single  image analysis, provides a basis  for solving for camera
positions and feature  depths.   The third design  choice solves  (or
rather avoids)  the problem of  integrating an edge  finder's results
into  an image. By using a very  simple edge finder, and by accepting
all the edge found, the  image structure is never lost.   This design
postpones the  problem of interpreting photometric  edges as physical
edges.

	The fourth  choice  is a  resolution not  to  write an  image
processor  that requires  operator  assistance and  parameter tuning.
The fifth choice of the  216 by 288 fixed  window size is a sin  that
proved  surprisingly expedient,  it  is explained  later. A  variable
window  version of CRE at halves,   thirds and other simple fractions
of its present window size will be made at some future date.

	The  final design choice  of using  machine language  was for
the  sake  of  implementing  node  link  data  structures  that   are
processed 100 faster  than LEAP, 10  times faster than  compiled LISP
and  that require significantly  less memory  than similar structures
in either LISP or LEAP. Furthermore machine code assembles  and loads
faster  than  higher  level  languages;   and  machine  code  can  be
extensively fixed and altered without recompiling.

	It  is  my  impression  that  CRE  does  not  raise  any  new
scientific problems;  nor does it  have any  really new solutions  to
the old  problems; rather CRE is another  competent video region edge
finding program with its  own set of tricks.  However, it is  further
my impression that the particular tricks for  smoothing,  nesting and
comparing polygons in CRE are original as programming techniques.

	The intended use  of CRE is  illustrated by the  sequences of
turn  table  pictures  on  pages  1  and  3.  The  figure on  page  5
illustrate the quality  of contoured images over  a range of  subject
matter. Some  results of  applying CRE  to typography  is illustrated
below:

~FEChristmas  Font

     A B C D E F G H I J K L M N O P Q R S T U V W X Y Z.
     a b c d e f g h i j k l m n o p q r s t u v w x y z.
~F8
ACKNOWLEDGEMENT.

	Tovar Mock assisted me with  the development of the type font
making program,   TVFONT.
~I1973,835;~F8- 4 -~